Out of the box, the Python standard library ships with many great modules, some of which provide a command-line interface (CLI), allowing us to do many useful things directly from the terminal without even needing to open a .py file.
This includes things like starting a web server, opening a browser, parsing JSON files, benchmarking programs and much more, all of which we will explore in this article.
Benchmarking
The one Python module you have likely already seen used directly from the command line is timeit. It can be used for simple benchmarking:
# In the simplest form:
python -m timeit "7 + 28"
50000000 loops, best of 5: 5.72 nsec per loop
# Creates "x" variable before running the test
python -m timeit -s "x = range(10000)" "sum(x)"
2000 loops, best of 5: 160 usec per loop
In many cases, a single line of code is not enough to perform a test - if you need to do some setup, then you can use the -s option to - for example - create variables or import modules. This option can also be used multiple times, as shown below.
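For instance, here's a quick benchmark (the exact statements are just an illustration) that uses -s twice - once to import and once to build the test data:
# Each -s statement is run once as setup before the timed code
python -m timeit -s "from collections import deque" -s "x = range(10000)" "deque(x)"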
As you can see from the first 2 examples above, the number of loops can vary widely - in the first example we have 50,000,000 loops, while in the second one we have only 2000 - this is based on how long a single execution of the code takes. To force a certain number of loops, you can use the -n X option. The output also says "best of 5", meaning that the whole measurement is repeated 5 times and the fastest result is reported; you can change the number of repetitions with the -r X option. An example of both options follows.
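For illustration, the following invocation (numbers picked arbitrarily) forces 10,000 loops and 7 repetitions of the earlier benchmark:
# -n sets the number of loops per repetition, -r the number of repetitions ("best of 7")
python -m timeit -n 10000 -r 7 -s "x = range(10000)" "sum(x)"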
If you need a slightly more powerful benchmarking tool, yet want the simplicity of timeit, then you can try pyperf:
pip install pyperf
# Uses "-l" and "-n" instead of "-n" and "-r" respectively
python -m pyperf timeit -s "x = range(10000)" "sum(x)" -o result.json
.....................
Mean +- std dev: 157 us +- 12 us
python -m pyperf stats result.json
Total duration: 13.4 sec
Start date: 2022-09-09 12:36:27
End date: 2022-09-09 12:36:42
...
0th percentile: 134 us (-9% of the mean) -- minimum
5th percentile: 135 us (-9% of the mean)
25th percentile: 137 us (-8% of the mean) -- Q1
50th percentile: 141 us (-4% of the mean) -- median
75th percentile: 161 us (+9% of the mean) -- Q3
95th percentile: 163 us (+10% of the mean)
100th percentile: 168 us (+14% of the mean) -- maximum
python3 -m pyperf hist result.json
python3 -m pyperf hist result.json --bins 10
133 us: 13 ############################################################
136 us: 14 #################################################################
139 us: 6 ############################
143 us: 1 #####
146 us: 0 |
150 us: 0 |
153 us: 0 |
156 us: 7 #################################
160 us: 17 ###############################################################################
163 us: 0 |
167 us: 2 #########
# To compare performance of different Python versions
python3.10 -m pyperf timeit -q --compare-to=python "str()"
Mean +- std dev: [python] 81.9 ns +- 4.5 ns -> [python3.10] 60.0 ns +- 1.9 ns: 1.36x faster (-27%)
pyperf has a very similar interface to the basic timeit - it even provides a subcommand of the same name. This subcommand produces benchmarking stats (result.json in the example above), which can then be inspected with the stats subcommand. You can also create a histogram from the same file using the hist subcommand.
pyperf can also be used to make performance comparisons between different versions of Python - this can be handy when evaluating improvements in newer Python versions.
As you can see, pyperf has clear advantages over the basic timeit; if you want more reasons to use it, see the comparison in the pyperf docs.
Finally, you can also use the built-in cProfile module, which provides tools for profiling Python programs. A simple invocation would look like this:
python -m cProfile -s cumulative some-code.py
1058 function calls (1029 primitive calls) in 9.222 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
5/1 0.000 0.000 9.222 9.222 {built-in method builtins.exec}
1 0.000 0.000 9.222 9.222 some-code.py:1(<module>)
1 9.218 9.218 9.218 9.218 some-code.py:3(exp)
4/1 0.000 0.000 0.004 0.004 <frozen importlib._bootstrap>:986(_find_and_load)
4/1 0.000 0.000 0.004 0.004 <frozen importlib._bootstrap>:956(_find_and_load_unlocked)
4/1 0.000 0.000 0.004 0.004 <frozen importlib._bootstrap>:650(_load_unlocked)
3/1 0.000 0.000 0.003 0.003 <frozen importlib._bootstrap_external>:837(exec_module)
5/1 0.000 0.000 0.003 0.003 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
1 0.000 0.000 0.003 0.003 decimal.py:2(<module>)
4/2 0.000 0.000 0.003 0.002 <frozen importlib._bootstrap>:549(module_from_spec)
This test was run against a sample script that computes e to the power of X. This kind of profiling can be useful, but it is quite limited - cProfile only gives us information about function calls, not about individual lines of code. If you call some particular function, such as append, in different places, it will all be aggregated into a single line in the cProfile output.
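If you prefer to trigger the profiler from code rather than from the command line, a minimal sketch (with a made-up work() function standing in for the real workload) could look like this:
import cProfile
import pstats

def work():
    # Made-up workload, just to have something to profile
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Print the 5 most expensive calls, sorted by cumulative time
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)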
For a deep dive into Python profiling, see my previous article here.
Running Servers
Let's say you need a simple HTTP server for testing or want to serve files from a directory. Python's http.server makes this super easy:
python -m http.server 8080 \
--directory /tmp/ \
--bind 127.0.0.1 # Defaults to all interfaces; supports IPv6
This is very convenient for testing, but it's definitely not production-ready. If you need a proper server for your Python web application, consider using a WSGI server such as Gunicorn.
Also, if you don't mind writing a bit of code, then http.server can be customized further - see the docs for examples and the available methods/options, and the sketch below.
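As a rough sketch of such a customization - assuming Python 3.7+ for the directory argument - you can subclass SimpleHTTPRequestHandler, for example to serve a specific directory and add a response header:
from http.server import HTTPServer, SimpleHTTPRequestHandler

class Handler(SimpleHTTPRequestHandler):
    def __init__(self, *args, **kwargs):
        # Serve files from /tmp/ instead of the current working directory
        super().__init__(*args, directory="/tmp/", **kwargs)

    def end_headers(self):
        # Add a custom header to every response
        self.send_header("Cache-Control", "no-store")
        super().end_headers()

HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()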
If - instead of just serving files - you need a full FTP server with upload, download, authentication and more, then pyftpdlib will solve that for you:
pip install pyftpdlib
python -m pyftpdlib \
--port 2121 \
-d /tmp/ \
-u someuser \
-P somepass
# From another terminal:
ftp
ftp> open 0.0.0.0 2121
Connected to 0.0.0.0 (0.0.0.0).
220 pyftpdlib 1.5.6 ready.
Name (0.0.0.0:martin): someuser
331 Username ok, send password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> ls
227 Entering passive mode (127,0,0,1,192,201).
125 Data connection already open. Transfer starting.
...
Here we use pyftpdlib to start a server listening on port 2121, serving the /tmp/ directory with the user someuser and password somepass. We are then able to connect to it using the ftp Unix command.
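If you'd rather configure the server in code than on the command line, a sketch roughly equivalent to the invocation above (built from pyftpdlib's authorizer/handler/server classes) looks like this:
from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers import FTPHandler
from pyftpdlib.servers import FTPServer

authorizer = DummyAuthorizer()
# "elradfmwMT" grants full read/write permissions - trim it down as needed
authorizer.add_user("someuser", "somepass", "/tmp/", perm="elradfmwMT")

handler = FTPHandler
handler.authorizer = authorizer

FTPServer(("0.0.0.0", 2121), handler).serve_forever()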
pyftpdlib should be sufficient in most cases, but if you're looking for alternatives, then make sure to check out the FTP server provided by the Twisted framework. You can install the library using pip install Twisted and run the server with twistd ftp ...
Debugging
While debugging from your IDE of choice is probably more convenient, Python provides tools for debugging directly in the terminal. One such tool/module is pdb.
There are multiple ways to use the pdb module, the first one being through the python -i ... option:
python -i script.py
Traceback (most recent call last):
File "script.py", line 4, in <module>
func()
File "script.py", line 2, in func
return 0 / 0
ZeroDivisionError: division by zero
>>> import pdb
>>> pdb.pm() # Post-mortem debugger
> /home/martin/.../script.py(2)func()
-> return 0 / 0
(Pdb) l # list surrounding lines
1 def func():
2 -> return 0 / 0
3
4 func()
[EOF]
(Pdb)
Here we tried running a simple script that does return 0 / 0, which obviously throws ZeroDivisionError. Thanks to the -i option, after the program exits we drop into an interactive REPL session where we can debug it. We do that by importing pdb and starting the post-mortem debugger with pdb.pm(). From there we can use pdb commands to inspect the environment and find out why our program crashed. In this example we use the l(ist) command, which shows the surrounding lines.
Another option is to execute the program with pdb from the beginning:
python -m pdb -c "b 4" script.py
> /home/martin/.../script.py(1)<module>()
-> def func():
(Pdb) l
1 -> def func():
2 return 0 / 0
3
4 B func()
[EOF]
(Pdb)
This breaks the execution on the first line; from there you can use pdb commands to set breakpoints (b), step to the next line (n), and so on. We also used the -c option to set a breakpoint on line 4, which is denoted by "B" after the line number in the listing.
pdb is great, but it's not the only debugging tool Python provides - if you are invoking a Python script from the command line and it's giving you a very non-descriptive error, such as a segfault, you can try the following:
python -c "import ctypes; ctypes.string_at(0)"
Segmentation fault (core dumped)
python -q -X faulthandler -c "import ctypes; ctypes.string_at(0)"
Fatal Python error: Segmentation fault
Current thread 0x00007f42c9602100 (most recent call first):
File "/usr/lib64/python3.8/ctypes/__init__.py", line 514 in string_at
File "<string>", line 1 in <module>
Segmentation fault (core dumped)
Executing import ctypes; ctypes.string_at(0) throws a segmentation fault without any actual context. To get more information we use the faulthandler module, which dumps the traceback.
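By the way, passing -X faulthandler on every invocation isn't the only way to turn this on - the same handler can also be enabled through an environment variable or directly in code:
# Environment variable equivalent of "-X faulthandler"
PYTHONFAULTHANDLER=1 python -c "import ctypes; ctypes.string_at(0)"

# Or enable it from inside the program
import faulthandler
faulthandler.enable()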
Finally, not just for debugging, but for general inspection of program execution, you can also use the trace module:
python -m trace --count -C cover --summary some-code.py
lines cov% module (path)
4 100% _distutils_hack.__init__ (/home/martin/.../_distutils_hack/__init__.py)
7 100% abc (/usr/lib64/python3.8/abc.py)
61 100% collections.__init__ (/usr/lib64/python3.8/collections/__init__.py)
5 100% decimal (/usr/lib64/python3.8/decimal.py)
128 100% numbers (/usr/lib64/python3.8/numbers.py)
13 100% some-code (some-code.py)
ls cover/
abc.cover collections.__init__.cover decimal.cover _distutils_hack.__init__.cover numbers.cover some-code.cover
cat cover/some-code.cover
1: from decimal import *
1: def exp(x):
1: getcontext().prec += 2
1: i, lasts, s, fact, num = 0, 0, 1, 1, 1
3645: while s != lasts:
3644: lasts = s
3644: i += 1
3644: fact *= i
3644: num *= x
3644: s += num / fact
1: getcontext().prec -= 2
1: return +s
1: exp(Decimal(3000))
In this example we use the sample program from the earlier profiling example, this time invoking it with the trace module, supplying the --count option to count the number of executions of each line, -C with a directory name (cover) where we want the reports to be stored, as well as the --summary option to print a quick summary to stdout. In the snippet, you can also see the contents of cover/some-code.cover, which shows how many times each line in the file was executed.
The trace module provides a couple more CLI options; for the complete list, see the docs.
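For instance, the --trace option prints every line as it is executed (very verbose, so the output is omitted here) and --listfuncs lists the functions that were called during the run:
python -m trace --trace some-code.py
python -m trace --listfuncs some-code.py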
Parsing JSON
You surely know that Python standard library includes json
module, but did you know that it also provides commandline interface?
python -m json.tool --sort-keys --json-lines < data.jsonl
{
"id": 24,
"last_name": "Doe",
"name": "John",
"username": "johnd"
}
{
"id": 31,
"last_name": "Smith",
"name": "Ben",
"username": "bens"
}
This way you can validate and pretty-print JSON on the command line. This is especially useful if you're working on a system that doesn't have other JSON processing tools, such as jq, installed. Additionally, if you provide the --json-lines option (as we did above), you can also parse the jsonl format - that is, a file containing one JSON object per line.
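The tool also accepts input and output files as positional arguments, which is handy for quick one-off formatting - the file names below are just made up:
# Pretty-print data.json and write the result into pretty.json
python -m json.tool data.json pretty.json
# Or read from stdin and sort the keys
echo '{"name": "John", "id": 24}' | python -m json.tool --sort-keys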
Compression and Packaging
Python's standard library also includes a set modules for compressing/decompressing archives, such as zlib
, gzip
, bz2
or tarfile
. Some of these also provide commandline interface:
python -m gzip -h
usage: gzip.py [-h] [--fast | --best | -d] [file [file ...]]
...
ls -l words*
-rw-rw-r--. 1 martin martin 37 Aug 23 2021 words1.txt
-rw-rw-r--. 1 martin martin 47 Aug 23 2021 words2.txt
# Use fast compression on files prefixed with "words"
python -m gzip --fast words*
# Files named "words*.txt.gz" were generated:
ls -l words*
-rw-rw-r--. 1 martin martin 37 Aug 23 2021 words1.txt
-rw-rw-r--. 1 martin martin 68 Sep 8 15:00 words1.txt.gz
-rw-rw-r--. 1 martin martin 47 Aug 23 2021 words2.txt
-rw-rw-r--. 1 martin martin 76 Sep 8 15:00 words2.txt.gz
In this example we have 2 files called words1.txt and words2.txt. We use the gzip module to compress these individual files. The disadvantage of gzip is that it can only compress individual files, not whole directories. If we want to create an archive from a directory, we have to use other modules:
python -m zipfile -c archive.zip words1.txt words2.txt # Create
ls -l
-rw-rw-r-- 1 martin martin 300 sep 11 12:51 archive.zip
-rw-rw-r-- 1 martin martin 37 sep 9 17:18 words1.txt
-rw-rw-r-- 1 martin martin 47 sep 9 17:18 words2.txt
python -m zipfile -t archive.zip # Test
Done testing
python -m zipfile -e archive.zip /tmp # Extract
ls /tmp/words*
/tmp/words1.txt /tmp/words2.txt
Here we use the zipfile module to create, test, and extract files from an archive. You can also use tarfile instead, which provides an identical interface, as sketched below.
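To give an idea what that looks like - and assuming the archive suffix is used to pick the compression - the tarfile equivalent of the commands above would be roughly:
python -m tarfile -c archive.tar.gz words1.txt words2.txt  # Create
python -m tarfile -l archive.tar.gz                        # List contents
python -m tarfile -e archive.tar.gz /tmp                   # Extract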
Also related to archiving - Python provides the zipapp module, which can create ZIP archives of Python code that can then be executed by the interpreter:
cat ./app/hello.py
def say_hello():
print("Hello World!")
tree
.
└── app
└── hello.py
python -m zipapp app/ \
-p "/usr/bin/env python3" \
-m "hello:say_hello"
./app.pyz
Hello World!
Here we have a directory named app containing a hello.py file with a say_hello function. To create an executable archive from this directory, we use python -m zipapp, specifying the directory, the interpreter path (-p), which also makes the file executable, and the entry point (the second -m), which is the name of the file and the function. This outputs app.pyz, which can be executed directly.
This module can be used as a simple way of distributing a Python application to users - all the user needs is a compatible Python interpreter. Beyond this simple example, it's also possible to create a standalone application with its dependencies packaged in - for a complete walk-through of the module, see the tutorial at RealPython.
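As a very rough sketch of that approach - using a made-up dependency on requests - you can install the dependencies into the application directory before zipping it up:
# Bundle third-party dependencies into the directory that gets packaged
python -m pip install requests --target app/
python -m zipapp app/ -p "/usr/bin/env python3" -m "hello:say_hello"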
Starting a Web Browser
We know that Python comes with "batteries included", but even then, from time-to-time you discover modules or tools that are little unexpected in language's standard library. One such module is - in my opinion - webbrowser
:
python -m webbrowser -n "https://www.python.org"
This module allows you to open a browser window (-n) or a tab (-t) with the specified URL, directly from the terminal or from your Python code.
The CLI will always open only the default web browser; for a non-default browser, use:
import webbrowser
url = "https://docs.python.org/"
wb = webbrowser.get("google-chrome %s")
wb.open_new(url)
Here, in google-chrome %s, the %s gets substituted with the URL, and the whole string gets invoked as if it was typed in the terminal. Therefore, the command - google-chrome in this case - has to be available in your shell.
Python-Powered Shell
Python comes with a lot of great tools, but there are many more tools out there. Great example of that is Xonsh - a Python-powered shell:
python -m pip install 'xonsh[full]'
# or
docker run -it --rm xonsh/xonsh:slim
...
Welcome to the xonsh shell (0.9.24)
~ Python-powered, cross-platform, Unix-gazing shell ~
...
<user>@02ba210f5006 / $ print("hello")
hello
<user>@02ba210f5006 / $ for i in range(5):
... print(i)
0
1
2
3
4
<user>@02ba210f5006 / $ ls
bin boot dev etc home lib lib64 media ...
<user>@02ba210f5006 / $ pwd
/
It combines the simplicity of a Unix shell (bash) with the power of Python - allowing us to run both commands like ls or pwd shown above, as well as any Python code, directly in the terminal. This makes it somewhat similar to IPython, but better suited for shell scripting.
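To give you a taste of that mix - quoting xonsh syntax from memory, so treat it as a sketch - you can capture command output into Python variables and interpolate Python expressions into commands:
<user>@02ba210f5006 / $ files = $(ls).split()   # capture "ls" output into a Python list
<user>@02ba210f5006 / $ print(len(files))
...
<user>@02ba210f5006 / $ echo @(2 + 2)           # interpolate a Python expression
4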
Closing Thoughts
Python is great at many things - including providing CLI tools for common tasks. Almost all of these are part of Python's standard library, which offers many more modules that can help you write better code or simplify common tasks. Finding all the useful stuff is really just a matter of reading the docs, and I strongly recommend reading through the library index - I'm positive you will find something useful that you didn't know was there.